
Potential benefits of a block-space GPU approach for discrete tetrahedral domains



Abstract

Data-parallel domain re-organization and thread-mapping techniques are relevant topics of study, as they can increase the efficiency of GPU computations when working on spatial discrete domains with non-box-shaped geometry. In this work we study the potential benefits of applying a succinct data re-organization of a tetrahedral data-parallel domain of size $\mathcal{O}(n^3)$ combined with an efficient block-space GPU map of the form $g:\mathbb{N} \rightarrow \mathbb{N}^3$. Results from the analysis suggest that, in theory, the combination of these two optimizations produces a significant performance improvement, as the block-based data re-organization allows a coalesced one-to-one correspondence at local thread-space while $g(\lambda)$ produces an efficient block-space spatial correspondence between groups of data and groups of threads, reducing the number of unnecessary threads from $O(n^3)$ to $O(n^2\rho^3)$, where $\rho$ is the linear block-size and typically $\rho^3 \ll n$. From the analysis, we obtained that a block-based succinct data re-organization can provide up to $2\times$ improved performance over a linear data organization, while the map can be up to $6\times$ more efficient than a bounding box approach. The results from this work can serve as a useful guide for more efficient GPU computation on tetrahedral domains found in spin lattice, finite element and special n-body problems, among others.
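
The abstract does not state the closed form of $g(\lambda)$, so the following is only a minimal sketch of what a map of the form $g:\mathbb{N} \rightarrow \mathbb{N}^3$ could look like, not the authors' formulation. It assumes that blocks are enumerated layer by layer over the index set $0 \le x \le y \le z$, so that inverting the tetrahedral and triangular numbers recovers the block coordinate from the linear block index. The function names `tri`, `tet`, `block_map` and `bounding_box_ratio` are hypothetical, and Python is used here only for illustration in place of a GPU kernel.

```python
from math import isqrt


def tri(k):
    """Triangular number: count of pairs (x, y) with 0 <= x <= y < k."""
    return k * (k + 1) // 2


def tet(k):
    """Tetrahedral number: count of triples (x, y, z) with 0 <= x <= y <= z < k."""
    return k * (k + 1) * (k + 2) // 6


def block_map(lam):
    """Map a linear block index lam >= 0 to a block coordinate (x, y, z).

    Assumed enumeration (not taken from the paper): layers z = 0, 1, 2, ...
    are visited in order, and each layer is a triangle of blocks x <= y <= z.
    """
    # Floating-point estimate of the layer, corrected with exact integer checks.
    z = int(round((6 * lam) ** (1.0 / 3.0)))
    while tet(z) > lam:
        z -= 1
    while tet(z + 1) <= lam:
        z += 1
    r = lam - tet(z)                  # offset of the block inside layer z
    y = (isqrt(8 * r + 1) - 1) // 2   # exact inverse of the triangular number
    x = r - tri(y)
    return x, y, z


def bounding_box_ratio(m):
    """Blocks launched by an m x m x m bounding box per block the map needs."""
    return m ** 3 / tet(m)
```

Under these assumptions, a grid of `tet(m)` blocks covers a tetrahedron of $m$ blocks per side exactly once, whereas a bounding box launches $m^3$ blocks. The ratio `bounding_box_ratio(m)` approaches 6 from below as $m$ grows, which is consistent with the "up to $6\times$" figure quoted in the abstract.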
